Overview

Dataset statistics

Number of variables: 21
Number of observations: 1460
Missing cells: 207
Missing cells (%): 0.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 751.0 KiB
Average record size in memory: 526.7 B
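
The two derived figures in this table can be checked from the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB = 1024 B. The variable names are illustrative and do not come from the report.

```python
# Raw figures copied from the dataset statistics table.
n_variables = 21
n_observations = 1460
missing_cells = 207
total_size_kib = 751.0

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures reported in the table.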

Variable types

Categorical: 6
Numeric: 14
Boolean: 1

Alerts

OverallQual is highly correlated with YearBuilt and 6 other fields (High correlation)
YearBuilt is highly correlated with OverallQual and 5 other fields (High correlation)
TotalBsmtSF is highly correlated with 1stFlrSF and 1 other field (High correlation)
1stFlrSF is highly correlated with TotalBsmtSF and 1 other field (High correlation)
2ndFlrSF is highly correlated with GrLivArea and 1 other field (High correlation)
GrLivArea is highly correlated with OverallQual and 5 other fields (High correlation)
FullBath is highly correlated with OverallQual and 6 other fields (High correlation)
TotRmsAbvGrd is highly correlated with 2ndFlrSF and 3 other fields (High correlation)
GarageYrBlt is highly correlated with OverallQual and 5 other fields (High correlation)
GarageCars is highly correlated with OverallQual and 6 other fields (High correlation)
GarageArea is highly correlated with OverallQual and 4 other fields (High correlation)
SalePrice is highly correlated with OverallQual and 9 other fields (High correlation)
OverallQual is highly correlated with YearBuilt and 7 other fields (High correlation)
YearBuilt is highly correlated with OverallQual and 3 other fields (High correlation)
BsmtFinSF1 is highly correlated with TotalBsmtSF (High correlation)
TotalBsmtSF is highly correlated with OverallQual and 3 other fields (High correlation)
1stFlrSF is highly correlated with TotalBsmtSF and 2 other fields (High correlation)
2ndFlrSF is highly correlated with GrLivArea and 1 other field (High correlation)
GrLivArea is highly correlated with OverallQual and 5 other fields (High correlation)
FullBath is highly correlated with OverallQual and 3 other fields (High correlation)
TotRmsAbvGrd is highly correlated with 2ndFlrSF and 3 other fields (High correlation)
GarageYrBlt is highly correlated with OverallQual and 3 other fields (High correlation)
GarageCars is highly correlated with OverallQual and 4 other fields (High correlation)
GarageArea is highly correlated with OverallQual and 3 other fields (High correlation)
SalePrice is highly correlated with OverallQual and 8 other fields (High correlation)
OverallQual is highly correlated with YearBuilt and 3 other fields (High correlation)
YearBuilt is highly correlated with OverallQual and 1 other field (High correlation)
TotalBsmtSF is highly correlated with 1stFlrSF (High correlation)
1stFlrSF is highly correlated with TotalBsmtSF (High correlation)
2ndFlrSF is highly correlated with GrLivArea (High correlation)
GrLivArea is highly correlated with 2ndFlrSF and 3 other fields (High correlation)
FullBath is highly correlated with OverallQual and 2 other fields (High correlation)
TotRmsAbvGrd is highly correlated with GrLivArea (High correlation)
GarageYrBlt is highly correlated with YearBuilt and 1 other field (High correlation)
GarageCars is highly correlated with OverallQual and 3 other fields (High correlation)
GarageArea is highly correlated with GarageCars (High correlation)
SalePrice is highly correlated with OverallQual and 3 other fields (High correlation)
ExterQual is highly correlated with KitchenQual (High correlation)
KitchenQual is highly correlated with ExterQual (High correlation)
Neighborhood is highly correlated with BsmtQual (High correlation)
BsmtQual is highly correlated with Neighborhood (High correlation)
Neighborhood is highly correlated with OverallQual and 14 other fields (High correlation)
OverallQual is highly correlated with Neighborhood and 15 other fields (High correlation)
YearBuilt is highly correlated with Neighborhood and 14 other fields (High correlation)
MasVnrArea is highly correlated with OverallQual and 3 other fields (High correlation)
ExterQual is highly correlated with Neighborhood and 9 other fields (High correlation)
BsmtQual is highly correlated with Neighborhood and 9 other fields (High correlation)
BsmtFinSF1 is highly correlated with TotalBsmtSF and 4 other fields (High correlation)
TotalBsmtSF is highly correlated with OverallQual and 5 other fields (High correlation)
HeatingQC is highly correlated with Neighborhood and 4 other fields (High correlation)
CentralAir is highly correlated with YearBuilt and 2 other fields (High correlation)
1stFlrSF is highly correlated with Neighborhood and 9 other fields (High correlation)
2ndFlrSF is highly correlated with Neighborhood and 7 other fields (High correlation)
GrLivArea is highly correlated with Neighborhood and 12 other fields (High correlation)
FullBath is highly correlated with Neighborhood and 11 other fields (High correlation)
KitchenQual is highly correlated with Neighborhood and 10 other fields (High correlation)
TotRmsAbvGrd is highly correlated with OverallQual and 5 other fields (High correlation)
GarageYrBlt is highly correlated with Neighborhood and 9 other fields (High correlation)
GarageFinish is highly correlated with Neighborhood and 5 other fields (High correlation)
GarageCars is highly correlated with Neighborhood and 7 other fields (High correlation)
GarageArea is highly correlated with Neighborhood and 12 other fields (High correlation)
SalePrice is highly correlated with Neighborhood and 19 other fields (High correlation)
BsmtQual has 37 (2.5%) missing values (Missing)
GarageYrBlt has 81 (5.5%) missing values (Missing)
GarageFinish has 81 (5.5%) missing values (Missing)
MasVnrArea has 861 (59.0%) zeros (Zeros)
BsmtFinSF1 has 467 (32.0%) zeros (Zeros)
TotalBsmtSF has 37 (2.5%) zeros (Zeros)
2ndFlrSF has 829 (56.8%) zeros (Zeros)
GarageCars has 81 (5.5%) zeros (Zeros)
GarageArea has 81 (5.5%) zeros (Zeros)
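
The percentages in the Missing and Zeros alerts are each count divided by the 1460 observations. The sketch below, with illustrative names, recomputes them and also checks that the per-variable missing counts add up to the 207 missing cells reported in the overview. It assumes the only missing values not covered by an alert are the 8 listed in the MasVnrArea variable section.

```python
n_observations = 1460

# Missing counts per variable. The first three come from the Missing alerts;
# MasVnrArea's 8 missing values are taken from its own variable section.
missing = {"BsmtQual": 37, "GarageYrBlt": 81, "GarageFinish": 81, "MasVnrArea": 8}

# Zero counts copied from the Zeros alerts.
zeros = {
    "MasVnrArea": 861,
    "BsmtFinSF1": 467,
    "TotalBsmtSF": 37,
    "2ndFlrSF": 829,
    "GarageCars": 81,
    "GarageArea": 81,
}


def pct(count, total=n_observations):
    """Share of observations as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


missing_pct = {name: pct(count) for name, count in missing.items()}
zeros_pct = {name: pct(count) for name, count in zeros.items()}
total_missing = sum(missing.values())

print(missing_pct)
print(zeros_pct)
print("Total missing cells:", total_missing)
```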

Reproduction

Analysis started: 2022-04-24 01:13:35.689954
Analysis finished: 2022-04-24 01:13:55.364543
Duration: 19.67 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
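
The reported duration is the difference between the two timestamps. A small sketch using only the Python standard library (the format string is an assumption based on how the timestamps are printed here):

```python
from datetime import datetime

# Assumed timestamp layout, matching the two values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-04-24 01:13:35.689954", FMT)
finished = datetime.strptime("2022-04-24 01:13:55.364543", FMT)

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```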

Variables

Neighborhood
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 25
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 90.7 KiB

NAmes: 225
CollgCr: 150
OldTown: 113
Edwards: 100
Somerst: 86
Other values (20): 786

Length

Max length: 7
Median length: 7
Mean length: 6.494520548
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CollgCr
2nd row: Veenker
3rd row: CollgCr
4th row: Crawfor
5th row: NoRidge

Common Values

Value                Count    Frequency (%)
NAmes                225      15.4%
CollgCr              150      10.3%
OldTown              113      7.7%
Edwards              100      6.8%
Somerst              86       5.9%
Gilbert              79       5.4%
NridgHt              77       5.3%
Sawyer               74       5.1%
NWAmes               73       5.0%
SawyerW              59       4.0%
Other values (15)    424      29.0%

Length

Histogram of lengths of the category

Value                Count    Frequency (%)
names                225      15.4%
collgcr              150      10.3%
oldtown              113      7.7%
edwards              100      6.8%
somerst              86       5.9%
gilbert              79       5.4%
nridght              77       5.3%
sawyer               74       5.1%
nwames               73       5.0%
sawyerw              59       4.0%
Other values (15)    424      29.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

OverallQual
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.099315068
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 5
median: 6
Q3: 7
95-th percentile: 8
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.382996547
Coefficient of variation (CV): 0.2267462053
Kurtosis: 0.09629277836
Mean: 6.099315068
Median Absolute Deviation (MAD): 1
Skewness: 0.2169439278
Sum: 8905
Variance: 1.912679448
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)
Value    Count    Frequency (%)
5        397      27.2%
6        374      25.6%
7        319      21.8%
8        168      11.5%
4        116      7.9%
9        43       2.9%
3        20       1.4%
10       18       1.2%
2        3        0.2%
1        2        0.1%

Value    Count    Frequency (%)
1        2        0.1%
2        3        0.2%
3        20       1.4%
4        116      7.9%
5        397      27.2%
6        374      25.6%
7        319      21.8%
8        168      11.5%
9        43       2.9%
10       18       1.2%

Value    Count    Frequency (%)
10       18       1.2%
9        43       2.9%
8        168      11.5%
7        319      21.8%
6        374      25.6%
5        397      27.2%
4        116      7.9%
3        20       1.4%
2        3        0.2%
1        2        0.1%
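
Because OverallQual takes only ten distinct values, its frequency table fully determines the descriptive statistics reported for it. The sketch below recomputes the sum, mean, variance, standard deviation and coefficient of variation from the counts. It assumes the report uses the sample variance (dividing by n − 1, as pandas does by default); the variable names are illustrative.

```python
import math

# Value -> count, copied from the OverallQual frequency table.
counts = {1: 2, 2: 3, 3: 20, 4: 116, 5: 397, 6: 374, 7: 319, 8: 168, 9: 43, 10: 18}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance (divides by n - 1).
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(n, total, mean, variance, std, cv)
```

With that assumption the results agree with the Sum, Mean, Variance, Standard deviation and CV listed for OverallQual.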

YearBuilt
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 112
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1971.267808
Minimum: 1872
Maximum: 2010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1872
5-th percentile: 1916
Q1: 1954
median: 1973
Q3: 2000
95-th percentile: 2007
Maximum: 2010
Range: 138
Interquartile range (IQR): 46

Descriptive statistics

Standard deviation: 30.20290404
Coefficient of variation (CV): 0.01532156307
Kurtosis: -0.4395519416
Mean: 1971.267808
Median Absolute Deviation (MAD): 25
Skewness: -0.6134611725
Sum: 2878051
Variance: 912.2154126
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
2006                  67       4.6%
2005                  64       4.4%
2004                  54       3.7%
2007                  49       3.4%
2003                  45       3.1%
1976                  33       2.3%
1977                  32       2.2%
1920                  30       2.1%
1959                  26       1.8%
1998                  25       1.7%
Other values (102)    1035     70.9%

Value    Count    Frequency (%)
1872     1        0.1%
1875     1        0.1%
1880     4        0.3%
1882     1        0.1%
1885     2        0.1%
1890     2        0.1%
1892     2        0.1%
1893     1        0.1%
1898     1        0.1%
1900     10       0.7%

Value    Count    Frequency (%)
2010     1        0.1%
2009     18       1.2%
2008     23       1.6%
2007     49       3.4%
2006     67       4.6%
2005     64       4.4%
2004     54       3.7%
2003     45       3.1%
2002     23       1.6%
2001     20       1.4%

MasVnrArea
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 327
Distinct (%): 22.5%
Missing: 8
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.6852617
Minimum: 0
Maximum: 1600
Zeros: 861
Zeros (%): 59.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 166
95-th percentile: 456
Maximum: 1600
Range: 1600
Interquartile range (IQR): 166

Descriptive statistics

Standard deviation: 181.0662066
Coefficient of variation (CV): 1.746306115
Kurtosis: 10.08241732
Mean: 103.6852617
Median Absolute Deviation (MAD): 0
Skewness: 2.66908421
Sum: 150551
Variance: 32784.97117
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     861      59.0%
72                    8        0.5%
108                   8        0.5%
180                   8        0.5%
120                   7        0.5%
16                    7        0.5%
340                   6        0.4%
106                   6        0.4%
80                    6        0.4%
200                   6        0.4%
Other values (317)    529      36.2%
(Missing)             8        0.5%

Value    Count    Frequency (%)
0        861      59.0%
1        2        0.1%
11       1        0.1%
14       1        0.1%
16       7        0.5%
18       2        0.1%
22       1        0.1%
24       1        0.1%
27       1        0.1%
28       1        0.1%

Value    Count    Frequency (%)
1600     1        0.1%
1378     1        0.1%
1170     1        0.1%
1129     1        0.1%
1115     1        0.1%
1047     1        0.1%
1031     1        0.1%
975      1        0.1%
922      1        0.1%
921      1        0.1%

ExterQual
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 84.2 KiB

TA: 906
Gd: 488
Ex: 52
Fa: 14

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gd
2nd row: TA
3rd row: Gd
4th row: TA
5th row: Gd

Common Values

Value    Count    Frequency (%)
TA       906      62.1%
Gd       488      33.4%
Ex       52       3.6%
Fa       14       1.0%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
ta       906      62.1%
gd       488      33.4%
ex       52       3.6%
fa       14       1.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

BsmtQual
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 0.3%
Missing: 37
Missing (%): 2.5%
Memory size: 83.3 KiB

TA: 649
Gd: 618
Ex: 121
Fa: 35

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gd
2nd row: Gd
3rd row: Gd
4th row: TA
5th row: Gd

Common Values

Value        Count    Frequency (%)
TA           649      44.5%
Gd           618      42.3%
Ex           121      8.3%
Fa           35       2.4%
(Missing)    37       2.5%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
ta       649      45.6%
gd       618      43.4%
ex       121      8.5%
fa       35       2.5%
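
The two BsmtQual frequency tables show different percentages for the same counts. The numbers are consistent with the first table dividing by all 1460 observations and the second dividing only by the 1423 non-missing ones. This is an inference from the figures, not something the report states. A short sketch with illustrative names:

```python
n_observations = 1460
n_missing = 37
counts = {"TA": 649, "Gd": 618, "Ex": 121, "Fa": 35}

n_present = n_observations - n_missing

# First table: share of all observations (missing values in the denominator).
share_of_all = {k: round(100 * c / n_observations, 1) for k, c in counts.items()}

# Second table: share of non-missing observations only (inferred).
share_of_present = {k: round(100 * c / n_present, 1) for k, c in counts.items()}

print(share_of_all)
print(share_of_present)
```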

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

BsmtFinSF1
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 637
Distinct (%): 43.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 443.639726
Minimum: 0
Maximum: 5644
Zeros: 467
Zeros (%): 32.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 383.5
Q3: 712.25
95-th percentile: 1274
Maximum: 5644
Range: 5644
Interquartile range (IQR): 712.25

Descriptive statistics

Standard deviation: 456.0980908
Coefficient of variation (CV): 1.028082167
Kurtosis: 11.11823629
Mean: 443.639726
Median Absolute Deviation (MAD): 383.5
Skewness: 1.685503072
Sum: 647714
Variance: 208025.4685
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     467      32.0%
24                    12       0.8%
16                    9        0.6%
686                   5        0.3%
662                   5        0.3%
20                    5        0.3%
936                   5        0.3%
616                   5        0.3%
560                   4        0.3%
553                   4        0.3%
Other values (627)    939      64.3%

Value    Count    Frequency (%)
0        467      32.0%
2        1        0.1%
16       9        0.6%
20       5        0.3%
24       12       0.8%
25       1        0.1%
27       1        0.1%
28       3        0.2%
33       1        0.1%
35       1        0.1%

Value    Count    Frequency (%)
5644     1        0.1%
2260     1        0.1%
2188     1        0.1%
2096     1        0.1%
1904     1        0.1%
1880     1        0.1%
1810     1        0.1%
1767     1        0.1%
1721     1        0.1%
1696     1        0.1%

TotalBsmtSF
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 721
Distinct (%): 49.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1057.429452
Minimum: 0
Maximum: 6110
Zeros: 37
Zeros (%): 2.5%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 519.3
Q1: 795.75
median: 991.5
Q3: 1298.25
95-th percentile: 1753
Maximum: 6110
Range: 6110
Interquartile range (IQR): 502.5

Descriptive statistics

Standard deviation: 438.7053245
Coefficient of variation (CV): 0.4148790481
Kurtosis: 13.25048328
Mean: 1057.429452
Median Absolute Deviation (MAD): 234.5
Skewness: 1.524254549
Sum: 1543847
Variance: 192462.3617
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     37       2.5%
864                   35       2.4%
672                   17       1.2%
912                   15       1.0%
1040                  14       1.0%
816                   13       0.9%
768                   12       0.8%
728                   12       0.8%
894                   11       0.8%
780                   11       0.8%
Other values (711)    1283     87.9%

Value    Count    Frequency (%)
0        37       2.5%
105      1        0.1%
190      1        0.1%
264      3        0.2%
270      1        0.1%
290      1        0.1%
319      1        0.1%
360      1        0.1%
372      1        0.1%
384      7        0.5%

Value    Count    Frequency (%)
6110     1        0.1%
3206     1        0.1%
3200     1        0.1%
3138     1        0.1%
3094     1        0.1%
2633     1        0.1%
2524     1        0.1%
2444     1        0.1%
2396     1        0.1%
2392     1        0.1%

HeatingQC
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 84.2 KiB

Ex: 741
TA: 428
Gd: 241
Fa: 49
Po: 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: Ex
2nd row: Ex
3rd row: Ex
4th row: Gd
5th row: Ex

Common Values

Value    Count    Frequency (%)
Ex       741      50.8%
TA       428      29.3%
Gd       241      16.5%
Fa       49       3.4%
Po       1        0.1%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
ex       741      50.8%
ta       428      29.3%
gd       241      16.5%
fa       49       3.4%
po       1        0.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

CentralAir
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

True: 1365
False: 95

Value    Count    Frequency (%)
True     1365     93.5%
False    95       6.5%

1stFlrSF
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 753
Distinct (%): 51.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1162.626712
Minimum: 334
Maximum: 4692
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 334
5-th percentile: 672.95
Q1: 882
median: 1087
Q3: 1391.25
95-th percentile: 1831.25
Maximum: 4692
Range: 4358
Interquartile range (IQR): 509.25

Descriptive statistics

Standard deviation: 386.587738
Coefficient of variation (CV): 0.3325123481
Kurtosis: 5.745841482
Mean: 1162.626712
Median Absolute Deviation (MAD): 234.5
Skewness: 1.376756622
Sum: 1697435
Variance: 149450.0792
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
864                   25       1.7%
1040                  16       1.1%
912                   14       1.0%
894                   12       0.8%
848                   12       0.8%
672                   11       0.8%
630                   9        0.6%
816                   9        0.6%
483                   7        0.5%
960                   7        0.5%
Other values (743)    1338     91.6%

Value    Count    Frequency (%)
334      1        0.1%
372      1        0.1%
438      1        0.1%
480      1        0.1%
483      7        0.5%
495      1        0.1%
520      5        0.3%
525      1        0.1%
526      1        0.1%
536      1        0.1%

Value    Count    Frequency (%)
4692     1        0.1%
3228     1        0.1%
3138     1        0.1%
2898     1        0.1%
2633     1        0.1%
2524     1        0.1%
2515     1        0.1%
2444     1        0.1%
2411     1        0.1%
2402     1        0.1%

2ndFlrSF
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 417
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 346.9924658
Minimum: 0
Maximum: 2065
Zeros: 829
Zeros (%): 56.8%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 728
95-th percentile: 1141.05
Maximum: 2065
Range: 2065
Interquartile range (IQR): 728

Descriptive statistics

Standard deviation: 436.5284359
Coefficient of variation (CV): 1.258034335
Kurtosis: -0.5534635576
Mean: 346.9924658
Median Absolute Deviation (MAD): 0
Skewness: 0.8130298163
Sum: 506609
Variance: 190557.0753
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     829      56.8%
728                   10       0.7%
504                   9        0.6%
546                   8        0.5%
672                   8        0.5%
600                   7        0.5%
720                   7        0.5%
896                   6        0.4%
862                   5        0.3%
780                   5        0.3%
Other values (407)    566      38.8%

Value    Count    Frequency (%)
0        829      56.8%
110      1        0.1%
167      1        0.1%
192      1        0.1%
208      1        0.1%
213      1        0.1%
220      1        0.1%
224      1        0.1%
240      2        0.1%
252      2        0.1%

Value    Count    Frequency (%)
2065     1        0.1%
1872     1        0.1%
1818     1        0.1%
1796     1        0.1%
1611     1        0.1%
1589     1        0.1%
1540     1        0.1%
1538     1        0.1%
1523     1        0.1%
1519     1        0.1%

GrLivArea
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 861
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1515.463699
Minimum: 334
Maximum: 5642
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 334
5-th percentile: 848
Q1: 1129.5
median: 1464
Q3: 1776.75
95-th percentile: 2466.1
Maximum: 5642
Range: 5308
Interquartile range (IQR): 647.25

Descriptive statistics

Standard deviation: 525.4803834
Coefficient of variation (CV): 0.3467456092
Kurtosis: 4.895120581
Mean: 1515.463699
Median Absolute Deviation (MAD): 326
Skewness: 1.366560356
Sum: 2212577
Variance: 276129.6334
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
864                   22       1.5%
1040                  14       1.0%
894                   11       0.8%
1456                  10       0.7%
848                   10       0.7%
1200                  9        0.6%
912                   9        0.6%
816                   8        0.5%
1092                  8        0.5%
1728                  7        0.5%
Other values (851)    1352     92.6%

Value    Count    Frequency (%)
334      1        0.1%
438      1        0.1%
480      1        0.1%
520      1        0.1%
605      1        0.1%
616      1        0.1%
630      6        0.4%
672      2        0.1%
691      1        0.1%
693      1        0.1%

Value    Count    Frequency (%)
5642     1        0.1%
4676     1        0.1%
4476     1        0.1%
4316     1        0.1%
3627     1        0.1%
3608     1        0.1%
3493     1        0.1%
3447     1        0.1%
3395     1        0.1%
3279     1        0.1%

FullBath
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.565068493
Minimum: 0
Maximum: 3
Zeros: 9
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 2
Maximum: 3
Range: 3
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.5509158013
Coefficient of variation (CV): 0.3520074704
Kurtosis: -0.8570428213
Mean: 1.565068493
Median Absolute Deviation (MAD): 0
Skewness: 0.0365615584
Sum: 2285
Variance: 0.3035082201
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=4)
Value    Count    Frequency (%)
2        768      52.6%
1        650      44.5%
3        33       2.3%
0        9        0.6%

Value    Count    Frequency (%)
0        9        0.6%
1        650      44.5%
2        768      52.6%
3        33       2.3%

Value    Count    Frequency (%)
3        33       2.3%
2        768      52.6%
1        650      44.5%
0        9        0.6%
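
FullBath reports a Median Absolute Deviation of 0 even though its interquartile range is 1. This happens because more than half of the observations sit exactly at the median. The sketch below rebuilds the column from its frequency table and recomputes both values. It assumes MAD means the unscaled median of absolute deviations from the median; names are illustrative.

```python
from statistics import median

# Value -> count, copied from the FullBath frequency table.
counts = {0: 9, 1: 650, 2: 768, 3: 33}

# Expand the table back into a sorted list of observations.
values = [value for value, count in sorted(counts.items()) for _ in range(count)]

med = median(values)

# Assumption: unscaled median absolute deviation about the median.
mad = median(abs(value - med) for value in values)

print("n =", len(values), "sum =", sum(values))
print("median =", med, "MAD =", mad)
```

Since 768 of the 1460 deviations are zero, the middle of the sorted deviations is zero as well.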

KitchenQual
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 84.2 KiB

TA: 735
Gd: 586
Ex: 100
Fa: 39

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gd
2nd row: TA
3rd row: Gd
4th row: Gd
5th row: Gd

Common Values

Value    Count    Frequency (%)
TA       735      50.3%
Gd       586      40.1%
Ex       100      6.8%
Fa       39       2.7%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
ta       735      50.3%
gd       586      40.1%
ex       100      6.8%
fa       39       2.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

TotRmsAbvGrd
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 12
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.517808219
Minimum: 2
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 2
5-th percentile: 4
Q1: 5
median: 6
Q3: 7
95-th percentile: 10
Maximum: 14
Range: 12
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.625393291
Coefficient of variation (CV): 0.2493772808
Kurtosis: 0.8807615657
Mean: 6.517808219
Median Absolute Deviation (MAD): 1
Skewness: 0.6763408364
Sum: 9516
Variance: 2.641903349
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)
Value               Count    Frequency (%)
6                   402      27.5%
7                   329      22.5%
5                   275      18.8%
8                   187      12.8%
4                   97       6.6%
9                   75       5.1%
10                  47       3.2%
11                  18       1.2%
3                   17       1.2%
12                  11       0.8%
Other values (2)    2        0.1%

Value    Count    Frequency (%)
2        1        0.1%
3        17       1.2%
4        97       6.6%
5        275      18.8%
6        402      27.5%
7        329      22.5%
8        187      12.8%
9        75       5.1%
10       47       3.2%
11       18       1.2%

Value    Count    Frequency (%)
14       1        0.1%
12       11       0.8%
11       18       1.2%
10       47       3.2%
9        75       5.1%
8        187      12.8%
7        329      22.5%
6        402      27.5%
5        275      18.8%
4        97       6.6%

GarageYrBlt
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 97
Distinct (%): 7.0%
Missing: 81
Missing (%): 5.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1978.506164
Minimum: 1900
Maximum: 2010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 1900
5-th percentile: 1930
Q1: 1961
median: 1980
Q3: 2002
95-th percentile: 2007
Maximum: 2010
Range: 110
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 24.68972477
Coefficient of variation (CV): 0.01247897288
Kurtosis: -0.418340998
Mean: 1978.506164
Median Absolute Deviation (MAD): 21
Skewness: -0.6494146239
Sum: 2728360
Variance: 609.5825091
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
2005                 65       4.5%
2006                 59       4.0%
2004                 53       3.6%
2003                 50       3.4%
2007                 49       3.4%
1977                 35       2.4%
1998                 31       2.1%
1999                 30       2.1%
1976                 29       2.0%
2008                 29       2.0%
Other values (87)    949      65.0%
(Missing)            81       5.5%

Value    Count    Frequency (%)
1900     1        0.1%
1906     1        0.1%
1908     1        0.1%
1910     3        0.2%
1914     2        0.1%
1915     2        0.1%
1916     5        0.3%
1918     2        0.1%
1920     14       1.0%
1921     3        0.2%

Value    Count    Frequency (%)
2010     3        0.2%
2009     21       1.4%
2008     29       2.0%
2007     49       3.4%
2006     59       4.0%
2005     65       4.5%
2004     53       3.6%
2003     50       3.4%
2002     26       1.8%
2001     20       1.4%
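
GarageYrBlt has 81 missing values, so it is worth checking which denominator each summary figure uses. The reported numbers are consistent with the mean and the distinct percentage being computed over the 1379 non-missing observations, while the missing percentage uses all 1460. This is inferred from the figures rather than stated in the report. A sketch with illustrative names:

```python
n_observations = 1460
n_missing = 81
n_distinct = 97
value_sum = 2728360

n_present = n_observations - n_missing

# Inferred: mean and distinct share use only non-missing observations.
mean = value_sum / n_present
distinct_pct = round(100 * n_distinct / n_present, 1)

# The missing share is relative to all observations.
missing_pct = round(100 * n_missing / n_observations, 1)

print(n_present, mean, distinct_pct, missing_pct)
```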

GarageFinish
Categorical

HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): 0.2%
Missing: 81
Missing (%): 5.5%
Memory size: 83.5 KiB

Unf: 605
RFn: 422
Fin: 352

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RFn
2nd row: RFn
3rd row: RFn
4th row: Unf
5th row: RFn

Common Values

Value        Count    Frequency (%)
Unf          605      41.4%
RFn          422      28.9%
Fin          352      24.1%
(Missing)    81       5.5%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
unf      605      43.9%
rfn      422      30.6%
fin      352      25.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

GarageCars
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 5
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.767123288
Minimum: 0
Maximum: 4
Zeros: 81
Zeros (%): 5.5%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 4
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7473150101
Coefficient of variation (CV): 0.4228991918
Kurtosis: 0.220997764
Mean: 1.767123288
Median Absolute Deviation (MAD): 0
Skewness: -0.3425489297
Sum: 2580
Variance: 0.5584797243
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=5)
Value    Count    Frequency (%)
2        824      56.4%
1        369      25.3%
3        181      12.4%
0        81       5.5%
4        5        0.3%

Value    Count    Frequency (%)
0        81       5.5%
1        369      25.3%
2        824      56.4%
3        181      12.4%
4        5        0.3%

Value    Count    Frequency (%)
4        5        0.3%
3        181      12.4%
2        824      56.4%
1        369      25.3%
0        81       5.5%

GarageArea
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 441
Distinct (%): 30.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 472.980137
Minimum: 0
Maximum: 1418
Zeros: 81
Zeros (%): 5.5%
Negative: 0
Negative (%): 0.0%
Memory size: 11.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 334.5
median: 480
Q3: 576
95-th percentile: 850.1
Maximum: 1418
Range: 1418
Interquartile range (IQR): 241.5

Descriptive statistics

Standard deviation: 213.8048415
Coefficient of variation (CV): 0.452037675
Kurtosis: 0.9170672023
Mean: 472.980137
Median Absolute Deviation (MAD): 120
Skewness: 0.1799809067
Sum: 690551
Variance: 45712.51023
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     81       5.5%
440                   49       3.4%
576                   47       3.2%
240                   38       2.6%
484                   34       2.3%
528                   33       2.3%
288                   27       1.8%
400                   25       1.7%
264                   24       1.6%
480                   24       1.6%
Other values (431)    1078     73.8%

Value    Count    Frequency (%)
0        81       5.5%
160      2        0.1%
164      1        0.1%
180      9        0.6%
186      1        0.1%
189      1        0.1%
192      1        0.1%
198      1        0.1%
200      4        0.3%
205      3        0.2%

Value    Count    Frequency (%)
1418     1        0.1%
1390     1        0.1%
1356     1        0.1%
1248     1        0.1%
1220     1        0.1%
1166     1        0.1%
1134     1        0.1%
1069     1        0.1%
1053     1        0.1%
1052     2        0.1%

SalePrice
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct663
Distinct (%)45.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean180921.1959
Minimum34900
Maximum755000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size11.5 KiB
2022-04-23T22:13:58.668297image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Quantile statistics

Minimum34900
5-th percentile88000
Q1129975
median163000
Q3214000
95-th percentile326100
Maximum755000
Range720100
Interquartile range (IQR)84025

Descriptive statistics

Standard deviation79442.50288
Coefficient of variation (CV)0.4391000319
Kurtosis6.53628186
Mean180921.1959
Median Absolute Deviation (MAD)38000
Skewness1.88287576
Sum264144946
Variance6311111264
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
140000              20     1.4%
135000              17     1.2%
155000              14     1.0%
145000              14     1.0%
190000              13     0.9%
110000              13     0.9%
115000              12     0.8%
160000              12     0.8%
130000              11     0.8%
139000              11     0.8%
Other values (653)  1323   90.6%

Value  Count  Frequency (%)
34900  1      0.1%
35311  1      0.1%
37900  1      0.1%
39300  1      0.1%
40000  1      0.1%
52000  1      0.1%
52500  1      0.1%
55000  2      0.1%
55993  1      0.1%
58500  1      0.1%

Value   Count  Frequency (%)
755000  1      0.1%
745000  1      0.1%
625000  1      0.1%
611657  1      0.1%
582933  1      0.1%
556581  1      0.1%
555000  1      0.1%
538000  1      0.1%
501837  1      0.1%
485000  1      0.1%

Interactions

[Figures omitted: interaction plots for pairs of numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
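The calculation can be sketched in plain Python as follows. This is an illustration under assumptions, not the code the report itself runs: tied values receive the average of their ranks, and the helper names `average_ranks`, `pearson` and `spearman_rho` are made up for this example.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Pearson correlation applied to the rank variables of x and y."""
    return pearson(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (made-up data).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson(x, y))
```

For this cubic relationship the rank correlation is 1 up to floating-point rounding, while Pearson's r on the raw values stays below 1.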

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
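A minimal plain-Python sketch of this calculation, with a check of the invariance under location and positive scale changes. The function name `pearson_r` and the sample data are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return sxy / math.sqrt(sxx * syy)

# Made-up sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_rescaled)
```

The two printed values should agree up to floating-point rounding. A negative scale factor on one variable would flip the sign of r.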

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
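The pair-counting definition above corresponds to the variant usually called tau-a, sketched below in plain Python. This is an assumption about which variant is meant: library implementations commonly use tau-b, which adjusts the denominator for ties, so results can differ when ties are present. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; it counts as neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Four observations: 5 concordant pairs, 1 discordant pair, 6 pairs in total.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The loop is quadratic in the number of observations, which is fine for a sketch but slow for large columns.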

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
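As a rough illustration, the bias correction as it is commonly stated can be written in plain Python for a contingency table of counts. The report's own implementation is not shown here, so the function below (`cramers_v_corrected`, a made-up name) is a sketch under that assumption rather than the report's code.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Made-up 2x2 tables: one with no association, one with perfect association.
print(cramers_v_corrected([[10, 10], [10, 10]]))
print(cramers_v_corrected([[50, 0], [0, 50]]))
```

Without the correction, a table drawn from independent variables would typically give a small positive V; the `max(0.0, ...)` step is what pulls such cases back to zero.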

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
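One way to read nullity correlation is as the ordinary Pearson correlation between the "is missing" indicators of two columns. The sketch below assumes that reading and assumes missing entries are represented by `None`; the function name and the two toy columns are hypothetical and are not taken from the dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    if saa == 0 or sbb == 0:
        # A column that is always present (or always missing) has no
        # variation in its indicator, so the correlation is undefined.
        return float("nan")
    return sab / math.sqrt(saa * sbb)

# Hypothetical columns that are missing in exactly the same rows.
year = [2003.0, None, 2005.0, None, 2004.0]
finish = ["RFn", None, "Fin", None, "RFn"]
print(nullity_correlation(year, finish))
```

A value near +1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.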
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

Row   Neighborhood  OverallQual  YearBuilt  MasVnrArea  ExterQual  BsmtQual  BsmtFinSF1  TotalBsmtSF  HeatingQC  CentralAir  1stFlrSF  2ndFlrSF  GrLivArea  FullBath  KitchenQual  TotRmsAbvGrd  GarageYrBlt  GarageFinish  GarageCars  GarageArea  SalePrice
0     CollgCr       7            2003       196.0       Gd         Gd        706         856          Ex         Y           856       854       1710       2         Gd           8             2003.0       RFn           2           548         208500
1     Veenker       6            1976       0.0         TA         Gd        978         1262         Ex         Y           1262      0         1262       2         TA           6             1976.0       RFn           2           460         181500
2     CollgCr       7            2001       162.0       Gd         Gd        486         920          Ex         Y           920       866       1786       2         Gd           6             2001.0       RFn           2           608         223500
3     Crawfor       7            1915       0.0         TA         TA        216         756          Gd         Y           961       756       1717       1         Gd           7             1998.0       Unf           3           642         140000
4     NoRidge       8            2000       350.0       Gd         Gd        655         1145         Ex         Y           1145      1053      2198       2         Gd           9             2000.0       RFn           3           836         250000
5     Mitchel       5            1993       0.0         TA         Gd        732         796          Ex         Y           796       566       1362       1         TA           5             1993.0       Unf           2           480         143000
6     Somerst       8            2004       186.0       Gd         Ex        1369        1686         Ex         Y           1694      0         1694       2         Gd           7             2004.0       RFn           2           636         307000
7     NWAmes        7            1973       240.0       TA         Gd        859         1107         Ex         Y           1107      983       2090       2         TA           7             1973.0       RFn           2           484         200000
8     OldTown       7            1931       0.0         TA         TA        0           952          Gd         Y           1022      752       1774       2         TA           8             1931.0       Unf           2           468         129900
9     BrkSide       5            1939       0.0         TA         TA        851         991          Ex         Y           1077      0         1077       1         TA           5             1939.0       RFn           1           205         118000

Last rows

Row   Neighborhood  OverallQual  YearBuilt  MasVnrArea  ExterQual  BsmtQual  BsmtFinSF1  TotalBsmtSF  HeatingQC  CentralAir  1stFlrSF  2ndFlrSF  GrLivArea  FullBath  KitchenQual  TotRmsAbvGrd  GarageYrBlt  GarageFinish  GarageCars  GarageArea  SalePrice
1450  NAmes         5            1974       0.0         TA         Gd        0           896          TA         Y           896       896       1792       2         TA           8             NaN          None          0           0           136000
1451  Somerst       8            2008       194.0       Gd         Gd        0           1573         Ex         Y           1578      0         1578       2         Ex           7             2008.0       Fin           3           840         287090
1452  Edwards       5            2005       80.0        TA         Gd        547         547          Gd         Y           1072      0         1072       1         TA           5             2005.0       Fin           2           525         145000
1453  Mitchel       5            2006       0.0         TA         Gd        0           1140         Ex         Y           1140      0         1140       1         TA           6             NaN          None          0           0           84500
1454  Somerst       7            2004       0.0         Gd         Gd        410         1221         Ex         Y           1221      0         1221       2         Gd           6             2004.0       RFn           2           400         185000
1455  Gilbert       6            1999       0.0         TA         Gd        0           953          Ex         Y           953       694       1647       2         TA           7             1999.0       RFn           2           460         175000
1456  NWAmes        6            1978       119.0       TA         Gd        790         1542         TA         Y           2073      0         2073       2         TA           7             1978.0       Unf           2           500         210000
1457  Crawfor       7            1941       0.0         Ex         TA        275         1152         Ex         Y           1188      1152      2340       2         Gd           9             1941.0       RFn           1           252         266500
1458  NAmes         5            1950       0.0         TA         TA        49          1078         Gd         Y           1078      0         1078       1         Gd           5             1950.0       Unf           1           240         142125
1459  Edwards       5            1965       0.0         Gd         TA        830         1256         Gd         Y           1256      0         1256       1         TA           6             1965.0       Fin           1           276         147500